Fortunately, there is an R package designed to query the Forest Inventory and Analysis database. This package requires that JAGS be installed as well (a good excuse to learn some Bayesian modeling, I suppose!). The creators of the package also have a comprehensive website on how to use the package: https://rfia.netlify.app/ Data can also be obtained from the FIA DataMart https://apps.fs.usda.gov/fia/datamart/CSV/datamart_csv.html, but the rFIA package queries this source directly in addition to the various estimates related to the FIA analysis. The FIA user guide is here: https://www.fia.fs.fed.us/library/database-documentation/current/ver72/FIADB%20User%20Guide%20P2_7-2_final.pdf, and lists the different tables and variables in the tables, among many other items relevant to the data set.
library(rFIA)
library(dplyr)
library(ggplot2)
library(rpart)
library(patchwork)
Let’s look a the forests of Colorado (which is about 91 MB -which is not so bad compared to the climate model data-sets!). Uncomment to download the tables:
## Uncomment to download data
# COforest <- getFIA(states = "CO", dir = getwd())
## Uncomment to read in existing data
COforest <- readFIA("C:/Users/jddru/Desktop/ASAENVR/ExploringFIA")
Let’s take a peak at the TREE table, which is the largest table in the download, and has the relevant measurements for a sample of trees in a plot.
CO.TREE <- COforest$TREE
CO.PLOT <- COforest$PLOT
CO.TREE
The TREE table contains a good deal of the relevant information regarding the trees in a plot area. We also may want to look at other tables too, which may contain relevant information. In order to get the location (in geographic coordinates) of a plot, this table must be joined with PLOT table where the PLT_CN is a foreign key linking the tables together. As we can see, there are over 200 different variables in the TREE table.
Mostly for practice, let’s select a subset of variables from the TREE table and the geographic information from PLOT, and join the tables, to make it more manageable:
CO.tree <- CO.TREE %>% select(
## Grab ID and location/plot information...
CN, PLT_CN, INVYR, UNITCD, COUNTYCD, PLOT, SUBP, TREE, CONDID, AZIMUTH, DIST,
## Relavent tree measurment...
STATUSCD, SPCD, DIA, DIAHTCD, HT, HTCD, ACTUALHT, TREECLCD, CR, CCLCD,
## Enviromental factors
AGENTCD, CULL, DECAYCD)
CO.plot <- CO.PLOT %>% select(
## Grab ID and location information..
CN, LAT, LON, ELEV
)
## Join the subsets of tables together
CO.forest <- full_join(CO.tree, CO.plot, by = c("PLT_CN" = "CN"))
CO.forest
And a summary:
summary(CO.forest)
CN PLT_CN INVYR UNITCD COUNTYCD
Min. :3.585e+12 Min. :3.584e+12 Min. :1984 Min. :1.000 Min. : 1.00
1st Qu.:5.333e+12 1st Qu.:5.327e+12 1st Qu.:2004 1st Qu.:2.000 1st Qu.: 45.00
Median :3.947e+13 Median :3.537e+13 Median :2009 Median :3.000 Median : 67.00
Mean :2.013e+14 Mean :8.444e+13 Mean :2008 Mean :2.888 Mean : 66.14
3rd Qu.:4.087e+14 3rd Qu.:1.888e+14 3rd Qu.:2015 3rd Qu.:4.000 3rd Qu.: 93.00
Max. :7.565e+14 Max. :4.319e+14 Max. :2019 Max. :5.000 Max. :125.00
NA's :18363 NA's :18363 NA's :18363 NA's :18363
PLOT SUBP TREE CONDID AZIMUTH DIST
Min. : 1 Min. : 1.000 Min. : 1.000 Min. :1.000 Min. : 1.0 Min. : 0.1
1st Qu.:81700 1st Qu.: 1.000 1st Qu.: 3.000 1st Qu.:1.000 1st Qu.: 91.0 1st Qu.: 7.9
Median :84896 Median : 2.000 Median : 7.000 Median :1.000 Median :181.0 Median :15.2
Mean :76447 Mean : 2.483 Mean : 9.224 Mean :1.053 Mean :180.4 Mean :14.2
3rd Qu.:87974 3rd Qu.: 3.000 3rd Qu.: 12.000 3rd Qu.:1.000 3rd Qu.:270.0 3rd Qu.:20.1
Max. :91320 Max. :10.000 Max. :409.000 Max. :4.000 Max. :360.0 Max. :56.0
NA's :18363 NA's :18363 NA's :18363 NA's :18363 NA's :18700 NA's :18699
STATUSCD SPCD DIA DIAHTCD HT HTCD
Min. :0.000 Min. : 15 Min. : 1.00 Min. :1.000 Min. : 1.00 Min. :1.00
1st Qu.:1.000 1st Qu.: 93 1st Qu.: 5.50 1st Qu.:1.000 1st Qu.: 18.00 1st Qu.:1.00
Median :1.000 Median :108 Median : 7.40 Median :1.000 Median : 36.00 Median :1.00
Mean :1.187 Mean :292 Mean : 8.33 Mean :1.271 Mean : 36.32 Mean :1.09
3rd Qu.:1.000 3rd Qu.:746 3rd Qu.:10.50 3rd Qu.:2.000 3rd Qu.: 51.00 3rd Qu.:1.00
Max. :3.000 Max. :990 Max. :52.00 Max. :2.000 Max. :162.00 Max. :3.00
NA's :18363 NA's :18363 NA's :35417 NA's :18363 NA's :35417 NA's :35417
ACTUALHT TREECLCD CR CCLCD AGENTCD CULL
Min. : 1.00 Min. :2.00 Min. : 0.00 Min. :1.00 Min. : 0.00 Min. : 0.00
1st Qu.: 18.00 1st Qu.:2.00 1st Qu.: 30.00 1st Qu.:3.00 1st Qu.: 0.00 1st Qu.: 0.00
Median : 35.00 Median :2.00 Median : 50.00 Median :3.00 Median :12.00 Median : 0.00
Mean : 36.03 Mean :2.44 Mean : 49.88 Mean :3.15 Mean :26.36 Mean : 4.78
3rd Qu.: 50.00 3rd Qu.:3.00 3rd Qu.: 69.00 3rd Qu.:3.00 3rd Qu.:50.00 3rd Qu.: 2.00
Max. :153.00 Max. :4.00 Max. :100.00 Max. :5.00 Max. :84.00 Max. :100.00
NA's :50420 NA's :39385 NA's :83546 NA's :81645 NA's :255332 NA's :85732
DECAYCD LAT LON ELEV
Min. :1.00 Min. :36.99 Min. :-109.1 Min. : 3370
1st Qu.:1.00 1st Qu.:38.00 1st Qu.:-107.6 1st Qu.: 7650
Median :2.00 Median :38.84 Median :-106.7 Median : 9020
Mean :2.12 Mean :38.88 Mean :-106.6 Mean : 8791
3rd Qu.:3.00 3rd Qu.:39.75 3rd Qu.:-105.7 3rd Qu.:10090
Max. :5.00 Max. :41.00 Max. :-102.0 Max. :13930
NA's :265509 NA's :2
Let’s count the number of species of trees in Colorado and see which are the most prevalent:
CO.forest %>% count(SPCD, sort = TRUE) %>% mutate(percent = round(100 * n / sum(n),3))
Looking at the top 5 trees by quantity (about 63.8% of all trees in CO) we have:
Let’s filter by these five species:
CO.forestSub <- CO.forest %>% filter(SPCD %in% c(746,93,108,814,19))
CO.forestSub
Let’s make a plot of Colorado, and see where these plot locations are:
## Get location of each plot to avoid over-plotting
CO.plotLocs <- CO.forest %>%
select(PLOT, LAT, LON) %>%
distinct(PLOT, .keep_all = TRUE)
ggplot(data = CO.plotLocs, aes(x = LON, y = LAT)) +
geom_point(size = 0.2, alpha = 0.5) +
borders("county","colorado",colour="grey70") +
borders("state", size = 2) +
xlab("Longitude") + ylab("Latitude") +
ggtitle("Plot of FIA Plot Locations for Colorado") +
scale_x_continuous(breaks = seq(from = -109.5, to = 101.25, by = 1)) +
coord_fixed(ratio = 1.25, xlim = c(-108.95,-102.2), ylim = c(37,41.1))
NA
Let’s look a just a single year to start, like 2018, and plot the distribution of trees across the state:
## Filter to year 2018
forest2018 <- CO.forestSub %>% filter(INVYR == 2018)
ggplot(data = forest2018, aes(x = LON, y = LAT, color = factor(SPCD))) +
geom_point(alpha = 0.5) +
borders("county","colorado",colour="grey70") +
borders("state", size = 1) +
xlab("Longitude") + ylab("Latitude") +
ggtitle("Plot of Top 5 Tree Species in Colorado") +
scale_x_continuous(breaks = seq(from = -109.5, to = 101.25, by = 1)) +
coord_fixed(ratio = 1.25, xlim = c(-108.95,-102.2), ylim = c(37,41.1)) +
facet_wrap(~factor(SPCD))
Let’s also see if there is a dependence of elevation on these specie, i.e. do the different species lie on different strata of elevation:
ggplot(data = forest2018, aes(y = ELEV, x = factor(SPCD), fill = factor(SPCD))) +
geom_violin() +
xlab("Species ID")+
ylab("Elevation") +
ggtitle("Elevation Distribution for the Top 5 Species")
We can see that these trees are located at differing altitudes with the Engelmann Spruce (19) at the highest, and Gambel Oak (814) at the lowest.
Let’s look at species 93 (Engelmann Spruce) attributes across time and see if anything interesting is apparent in the subset of the TREE table:
CO.ESpruce <- CO.forestSub %>% filter(SPCD == 93)
CO.ESpruce
The STATUSCD variable gives information about whether the sample tree is live, cut, or dead at the time of measurement. Let’s see how this changes over time for the lodgepole pines. From the documentation we have:
ESpruce.time <- CO.ESpruce %>% group_by(INVYR) %>% count(STATUSCD)
ESpruce.time
Let’s also look at totals across status and year:
CO.ESpruce %>% group_by(STATUSCD) %>% count()
total.ESpruce <- CO.ESpruce %>% group_by(INVYR) %>% count()
total.ESpruce
Let’s plot this:
ggplot(data = ESpruce.time, aes(x = INVYR, y = n, color = factor(STATUSCD))) +
geom_line() +
geom_point()
Let’s generate a similar plot for the quaking aspen (which has seen a dieback since the 1990s):
## Filter for aspen
CO.Aspen <- CO.forestSub %>% filter(SPCD == 746)
## Group by the STATUSCD
Aspen.time <- CO.Aspen %>% group_by(INVYR) %>% count(STATUSCD)
## Look at totals by status table:
CO.Aspen %>% group_by(STATUSCD) %>% count()
## Look at the total numbers across years table:
total.Aspen <- CO.Aspen %>% group_by(INVYR) %>% count()
total.Aspen
Once again, let’s look at a plot:
ggplot(data = Aspen.time, aes(x = INVYR, y = n, color = factor(STATUSCD))) +
geom_line() +
geom_point()
In both cases, we can see that overall number of pines and aspens has decreased, and the number of recorded dead trees has increases over time.
As an aside, we can also look at different variables associated with trees to study their characteristics. For example we can look at tree height (it is reasonable to treat tree height as a proxy for age, since it is not unreasonable to assume older trees are taller) vs. elevation, latitude, and longitude, for the five selected species in the year 2018, and color the points based on the STATUSCD variable to see if an groupings/clusterings emerge, based on spatial variables. (I’ve been putting the plots into a new window, to perform the ’ol jewelers loupe inspection on these). First for elevation:
ggplot(data = forest2018, aes(x = ELEV, y = HT, color = factor(STATUSCD))) +
geom_point(size = 1, alpha = 0.25, aes(color = factor(STATUSCD))) +
facet_wrap(~(factor(SPCD))) +
ggtitle("Height vs. Elevation")
Recall 1 in green are live trees, and 2 in blue are dead trees.
The Engelmann spruce (93) seems to have more dead trees clustered at higher elevations (>10,000ft or so), and the others are more or less scattered. We do see an increase in dead trees for the lodgpole pine (108) at it’s lower elevations (between 8,000-9,500 ft). Height doesn’t seem to have a strong impact on the proportion of dead vs. alive trees.
For Latitude:
ggplot(data = forest2018, aes(x = LAT, y = HT, color = factor(STATUSCD))) +
geom_point(size = 1, alpha = 0.25, aes(color = factor(STATUSCD))) +
facet_wrap(~(factor(SPCD))) +
ggtitle("Height vs. Latitude")
Once again the spruce and pine (93, and 108) so some clustering based on latitude, namely there are more dead spruce in the southern parts of the state, and there are more dead pines in the norther part. There does seem to be a cluster of dead pines at about 37.5 degrees, and there may be a marginal impact of height on the number of dead pines.
For Longitude:
ggplot(data = forest2018, aes(x = LON, y = HT, color = factor(STATUSCD))) +
geom_point(size = 1, alpha = 0.25, aes(color = factor(STATUSCD))) +
facet_wrap(~(factor(SPCD))) +
ggtitle("Height vs. Longitude")
The spruce shows a cluster of dead trees in the middle-west part of the state (along -107 degrees), whereas all the other species are scattered. Height doesn’t seem to have much of an impact.
For diameter:
ggplot(data = forest2018, aes(x = DIA, y = HT, color = factor(STATUSCD))) +
geom_point(size = 1, alpha = 0.25, aes(color = factor(STATUSCD))) +
facet_wrap(~(factor(SPCD))) +
ggtitle("Height vs. Diameter")
Here, we can see that there is a slight impact of height on the status of the tree, namely taller trees are more often found dead that shorter trees, in particular for the pine and spruce. The exception to this is the Gambel oak, which tends to be a shorter tree. There is a strong linear relationship between the height and the diameter of the tree with strong non-constant variance as diameter and height increase.
The lodgepole pine and Engelmann spruce seems to have the most death records, and show some degree of spatial pattern in the plots/observations above. For these species it may be interesting to to see the evolution of the STATUSCD over time. Filtering by species 93 and by status code 2, which corresponds to dead Engelmann Spruce, we can make similar spatial plots as above:
ESpruce.timeDead <- CO.ESpruce %>% filter(STATUSCD == 2)
Looking at longitude vs. latitude:
ggplot(data = ESpruce.timeDead, aes(x = LON, y = LAT, color = factor(INVYR))) +
geom_point(size = 1, alpha = 0.25, aes(color = factor(INVYR))) +
facet_wrap(~(factor(INVYR)))
And height vs. longitude:
ggplot(data = ESpruce.timeDead, aes(x = LAT, y = HT, color = factor(INVYR))) +
geom_point(size = 1, alpha = 0.25, aes(color = factor(INVYR))) +
facet_wrap(~(factor(INVYR)))
It may be better to visualize these plots as 2-D density plots (the lat/lon plot in particular). We also may want to look beyond just Colorado for some spatial patterns (e.g. across the entire inter-mountain west).
Part of the rFIA package are a whole host of unctions to estimate various forest parameters. The website listed in the introduction has descriptions of these. In their tutorial they provide instructions for determining abundance estimates. We shall see that the analysis below describes the same sort of behavior in the analysis above, namely the number of trees are decreasing across the the state and the mortality is increasing. We also observe that some trees are more impacted by mortality that others.
Since we seem interested in time dependent behavior, lets construct a time series plot for the abundance (takes a moment to run):
## All inventory years available and dissaggregate by species and size
tpaCO <- tpa(COforest, variance = TRUE)
tpaCO_species <- tpa(COforest, bySpecies = TRUE)
tpaCO_sizeClass <- tpa(COforest, bySizeClass = TRUE)
Overall time-series plot:
plotFIA(tpaCO, y = BAA,
plot.title = "Colorado Abundance Time Series")
We see an overall decreasing trend by year. One advantage to use these functions, is we can obtain the EVALIDator (FIA’s flagship estimation tool) to obtain estimates of uncertainty:
tpaCO %>% select(YEAR, BAA, BAA_PERC_VAR, N)
We can also dis-aggregate based on the species, just as before:
plotFIA(tpaCO_species, y = BAA, grp = COMMON_NAME,
n.max = 10,
plot.title = "Colorado Abundance By Species")
We can also estimate the biomass of standing trees:
CO_bio <- biomass(COforest, variance = TRUE)
CO_bio
Plotting the net volume per acre:
plotFIA(CO_bio, y = NETVOL_ACRE, plot.title = "Colorado Biomass Estimates")
Mortality and harvest rates:
CO_mort <- growMort(COforest, variance = TRUE)
Recruitment data unavailable for: . Returning 0 for all recruitment estimates which include these states.
CO_mort
Let’s look at a plot of the mortality rates:
plotFIA(CO_mort, y = MORT_TPA, plot.title = "Colorado Mortality Estimates")
From an environmental/greenhouse gas perspective, it may be interesting to examine the estimate of carbon stocks:
CO_carbon <- carbon(COforest, variance = TRUE)
CO_carbon
Plotting this yields:
plotFIA(CO_carbon, y = CARB_ACRE,
grp = POOL,
plot.title = "Colorado Carbon Abundance Estimates")
We could carry out the same procedure for examining bio-diversity, invasive species indices, seedling abundance, etc… The nice thing about using these functions, is that we can easily access these estimates for modeling with other data, and generate confidence intervals for these estimates, in the event that we’d like to perform a statistical test or make inferences about the population.
In the above analysis, we do observe some trends in the data. It may be germane to investigate some modeling in order to understand the overall structrue of the data. This will also enable us to include the climate data and observe and associations, should they exist, between the forest data and the climate data. In order to proceed, I’ll make use of the estimate functions explored prevously. We can obtain estimates at the plot level in order to examine any spatial variation.
Lets join these tables (all but invasive species since this mostly consists of text/categorical information -we could one-hot encode this for use later):
COtrees <- inner_join(tpaCO_plot, bioCO_plot, by = 'PLT_CN') %>%
inner_join(., mortCO_plot, by = 'PLT_CN') %>%
inner_join(., carbCO_plot, by = 'PLT_CN') %>%
inner_join(., seedCO_plot, by = 'PLT_CN') %>%
inner_join(., vitalCO_plot, by = 'PLT_CN')
COtrees
There are many variables, it it is becoming a little cumbersome to plot and control for variables in plots/tables. In order to understand the behavior of the trees in Colorado, it may be useful to fit a subset of the variables via a regression tree. Namely, we would like predict or see what variables influence the STATUSCD variable. To begin let’s try just a single year, such as 2018. Let’s select a subset of variables relevant to this (i.e. exclude identification variables, variables with too many missing values etc…)
## Select subset
forest2018.sub <- forest2018 %>%
select(STATUSCD, SPCD, DIA, HT, TREECLCD, CR, CCLCD, CULL, LAT, LON, ELEV)
## Recode some variables as factors
forest2018.sub$STATUSCD <- factor(forest2018.sub$STATUSCD)
forest2018.sub$SPCD <- factor(forest2018.sub$SPCD)
forest2018.sub$TREECLCD <- factor(forest2018.sub$TREECLCD)
forest2018.sub$CCLCD <- factor(forest2018.sub$CCLCD)
## Fit a regression tree (pun intended)
CO.TreeMod2018 <- rpart(STATUSCD ~ .,
forest2018.sub,
method = "class")
plot(CO.TreeMod2018, compress = T, uniform = T, branch = 0.4, main = "Reg Tree")
text(CO.TreeMod2018)
This is maybe a bit dissapointing, but let’s try the same approach with k-means clustering:
set.seed(1)
kmeans(na.omit(forest2018.sub), 5)
K-means clustering with 5 clusters of sizes 384, 608, 1199, 1192, 1186
Cluster means:
STATUSCD SPCD DIA HT TREECLCD CR CCLCD CULL LAT LON
1 1 658.4505 7.543229 34.83073 2.460938 43.24740 3.143229 5.8723958 39.23088 -107.1375
2 1 134.0872 8.912007 48.44079 2.027961 58.51645 3.228618 0.5263158 38.86642 -106.5323
3 1 121.7923 8.496163 49.74812 2.005004 52.22936 3.192661 0.6605505 39.25637 -106.6639
4 1 228.8993 8.335990 48.88339 2.010906 48.85067 3.197148 1.2147651 39.26438 -106.5696
5 1 552.2066 8.985666 51.39629 2.155143 37.57251 3.144182 6.2925801 39.25227 -106.9445
ELEV
1 7854.479
2 11290.312
3 10459.875
4 9650.394
5 8877.698
Clustering vector:
1 2 4 5 6 7 8 10 11 12 13 14 15 16 17 21 24 25 26 27
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
29 30 31 32 33 34 35 36 37 39 41 42 43 44 45 46 47 48 49 50
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
51 52 53 54 55 56 57 58 59 62 64 67 68 69 70 71 72 73 75 76
5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2
78 79 80 81 82 83 84 85 86 87 89 90 91 92 93 94 96 98 103 104
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 5 5
105 106 107 109 110 111 112 115 118 120 121 124 125 126 127 129 130 131 132 133
5 5 4 4 4 4 4 4 4 5 4 5 5 4 5 5 5 5 5 4
134 135 136 137 140 142 143 144 145 146 147 148 149 151 152 153 158 160 161 162
5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
165 166 167 168 169 170 171 172 173 174 175 177 178 179 180 181 182 183 184 185
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
186 187 188 189 192 194 196 201 202 204 205 209 211 212 213 217 218 219 220 221
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 4 4 4
222 223 224 225 226 228 229 231 232 233 234 235 236 237 238 239 240 241 242 243
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
245 246 247 248 249 250 251 252 253 254 255 256 266 267 268 269 270 272 273 274
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
276 277 279 281 282 285 286 287 288 289 290 291 292 293 294 295 300 304 305 306
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
307 308 309 310 311 312 315 331 332 333 334 341 342 343 344 346 347 349 352 353
4 4 4 4 4 4 1 5 5 5 5 5 5 5 5 5 5 5 5 5
354 355 356 358 359 360 361 362 363 364 365 366 367 368 370 371 372 373 374 377
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
398 399 400 403 405 406 407 408 409 410 411 412 413 414 417 418 419 420 421 422
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
423 434 447 448 449 450 451 452 453 455 456 460 461 466 467 469 470 471 472 475
5 5 5 5 5 5 1 1 1 1 1 1 1 1 1 1 1 1 1 1
511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
531 532 533 534 536 537 538 539 540 541 542 543 544 545 547 548 549 550 551 552
3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4
553 554 555 556 557 561 562 564 566 567 568 569 577 578 581 586 588 589 590 591
4 4 4 4 4 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2
595 596 597 599 600 604 610 617 622 625 627 628 630 631 639 640 641 644 645 652
2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
658 659 660 661 662 663 664 665 666 669 671 672 673 674 675 676 677 678 679 680
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4
681 682 683 684 685 686 687 688 689 692 693 694 695 696 697 698 699 700 701 702
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
703 704 705 706 707 708 710 711 712 713 714 715 716 718 719 720 721 726 727 728
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3
729 730 731 733 735 736 737 738 739 741 742 743 744 745 747 748 753 756 764 765
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
767 768 769 770 771 772 773 776 777 779 781 782 783 784 786 793 798 799 800 801
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
803 804 805 806 807 808 810 811 812 813 817 818 828 829 830 831 838 839 840 841
3 3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5
847 851 858 861 862 865 866 868 869 870 897 909 914 915 916 920 924 927 930 932
5 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4
933 937 939 942 951 952 953 954 955 956 961 964 965 969 971 974 975 978 979 980
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
981 982 984 986 988 989 990 991 992 993 994 995 997 999 1001 1002 1003 1004 1005 1006
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
1007 1008 1009 1012 1013 1014 1015 1019 1020 1021 1022 1024 1025 1026 1027 1028 1029 1030 1031 1032
4 4 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3
1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1051 1052 1054 1055
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
1056 1058 1059 1060 1061 1065 1069 1070 1071 1072 1073 1074 1075 1076 1077 1081 1082 1093 1094 1097
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
1099 1100 1101 1103 1106 1112 1119 1121 1122 1130 1132 1133 1150 1152 1170 1171 1172 1176 1181 1182
3 3 3 3 3 3 3 3 4 4 4 4 4 4 1 1 1 1 1 1
1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1196 1198 1199 1201 1202 1203 1206 1209 1210
1 1 1 1 1 1 1 1 1 1 1 1 5 5 5 5 5 5 5 5
1211 1213 1214 1215 1216 1218 1219 1220 1221 1222 1224 1225 1226 1227 1228 1229 1230 1233 1235 1236
5 5 5 5 5 5 5 3 3 3 3 3 3 3 3 3 3 3 3 3
1238 1239 1240 1241 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1275 1276 1279 1282 1283
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
1284 1285 1286 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1314 1315 1316 1318 1319
3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4
1323 1328 1329 1330 1331 1332 1333 1335 1336 1337 1338 1339 1340 1341 1343 1344 1345 1346 1347 1348
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
1349 1350 1351 1352 1353 1354 1355 1361 1365 1369 1370 1371 1372 1374 1375 1376 1389 1390 1391 1392
4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 2 2 2 2
1395 1396 1401 1403 1405 1406 1407 1408 1409 1412 1418 1420 1422 1424 1425 1427 1428 1429 1430 1432
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
1433 1436 1437 1438 1440 1441 1442 1443 1444 1445 1446 1452 1453 1455 1456 1457 1464 1467 1475 1476
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3
1484 1485 1486 1487 1491 1492 1493 1499 1503 1504 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
1516 1517 1518 1520 1521 1523 1524 1525 1526 1527 1528 1529 1537 1539 1540 1545 1546 1549 1550 1551
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2
1552 1553 1554 1555 1556 1557 1559 1595 1596 1597 1598 1599 1600 1602 1606 1607 1608 1609 1610 1611
2 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 4 4 4
1612 1613 1614 1615 1616 1617 1619 1621 1622 1625 1626 1627 1631 1633 1634 1635 1636 1683 1685 1687
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5
1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1705 1706 1707 1708 1709
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
1710 1711 1712 1713 1714 1715 1716 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1734
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
1735 1736 1737 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1753 1841 1855 1857 1858 1859
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4
1860 1862 1863 1864 1865 1866 1867 1870 1871 1873 1876 1877 1889 1908 1909 1912 1913 1920 1921 1922
4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 1 1 1 1
[ reached getOption("max.print") -- omitted 3569 entries ]
Within cluster sum of squares by cluster:
[1] 127955691 72562651 109469929 158932707 194937590
(between_SS / total_SS = 87.2 %)
Available components:
[1] "cluster" "centers" "totss" "withinss" "tot.withinss" "betweenss"
[7] "size" "iter" "ifault"